Expectation Propagation on the Maximum of Correlated Normal Variables
Many inference problems involving questions of optimality ask for the maximum
or the minimum of a finite set of unknown quantities. This technical report
derives the first two posterior moments of the maximum of two correlated
Gaussian variables and the first two posterior moments of the two generating
variables (corresponding to Gaussian approximations minimizing relative
entropy). It is shown how this can be used to build a heuristic approximation
to the maximum relationship over a finite set of Gaussian variables, allowing
approximate inference by Expectation Propagation on such quantities.
Comment: 11 pages, 7 figures
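For concreteness, the moment computation at the core of such a derivation can be sketched as follows, using the classic closed-form expressions for the maximum of two correlated Gaussians (Clark, 1961) that work of this kind builds on; the function name and parameterisation below are illustrative, not the report's own code.

```python
import numpy as np
from scipy.stats import norm

def max_of_two_gaussians(m1, s1, m2, s2, rho):
    """First two moments of max(X1, X2) for X1 ~ N(m1, s1^2),
    X2 ~ N(m2, s2^2) with correlation rho (Clark, 1961).
    Returns the mean and variance of the maximum."""
    theta = np.sqrt(s1**2 + s2**2 - 2.0 * rho * s1 * s2)  # std of X1 - X2
    alpha = (m1 - m2) / theta
    mean = m1 * norm.cdf(alpha) + m2 * norm.cdf(-alpha) + theta * norm.pdf(alpha)
    second = ((m1**2 + s1**2) * norm.cdf(alpha)
              + (m2**2 + s2**2) * norm.cdf(-alpha)
              + (m1 + m2) * theta * norm.pdf(alpha))
    return mean, second - mean**2
```

A Gaussian matched to these two moments is precisely the kind of relative-entropy-minimising approximation the abstract refers to.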
Probabilistic Interpretation of Linear Solvers
This manuscript proposes a probabilistic framework for algorithms that
iteratively solve unconstrained linear problems Bx = b with positive definite
B for x. The goal is to replace the point estimates returned by existing
methods with a Gaussian posterior belief over the elements of the inverse of
B, which can be used to estimate errors. Recent probabilistic interpretations
of the secant family of quasi-Newton optimization algorithms are extended.
Combined with properties of the conjugate gradient algorithm, this leads to
uncertainty-calibrated methods with very limited cost overhead over conjugate
gradients, a self-contained novel interpretation of the quasi-Newton and
conjugate gradient algorithms, and a foundation for new nonlinear optimization
methods.
Comment: final version, in press at SIAM J Optimization
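To give a feel for the inference involved, the following sketch conditions a Kronecker-structured Gaussian belief over the inverse matrix on matrix-vector observations. This is generic matrix-variate Gaussian regression under assumed prior factors (H0, V), a minimal sketch rather than the calibrated solver of the paper; all names and sizes are illustrative.

```python
import numpy as np

def inverse_posterior_mean(H0, V, Y, S):
    """Posterior mean over H = inv(B) after observing H @ Y = S
    (i.e. pairs y_i = B s_i), under the Kronecker Gaussian prior
    vec(H) ~ N(vec(H0), V (x) W); the W factor cancels in the mean."""
    G = np.linalg.solve(Y.T @ V @ Y, Y.T @ V)   # (Y^T V Y)^{-1} Y^T V
    return H0 + (S - H0 @ Y) @ G

# toy demo with illustrative sizes
rng = np.random.default_rng(0)
B = np.diag(np.arange(1.0, 6.0))        # positive definite system matrix
S = rng.standard_normal((5, 3))         # probe directions s_i
Y = B @ S                               # observed products y_i = B s_i
H = inverse_posterior_mean(np.eye(5), np.eye(5), Y, S)
x = H @ np.ones(5)                      # approximate solve of B x = b
```

The posterior mean reproduces the observed pairs exactly (H @ Y equals S); the posterior covariance, omitted here, quantifies the error remaining in the unexplored subspace.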
Optimal Reinforcement Learning for Gaussian Systems
The exploration-exploitation trade-off is among the central challenges of
reinforcement learning. The optimal Bayesian solution is intractable in
general. This paper studies to what extent analytic statements about optimal
learning are possible if all beliefs are Gaussian processes. A first-order
approximation of learning of both loss and dynamics, for nonlinear,
time-varying systems in continuous time and space, subject to a relatively weak
restriction on the dynamics, is described by an infinite-dimensional partial
differential equation. An approximate finite-dimensional projection gives an
impression of how this result may be helpful.
Comment: final pre-conference version of this NIPS 2011 paper. Once again,
please note some nontrivial changes to exposition and interpretation of the
results, in particular in Equation (9) and Eqs. 11-14. The algorithm and
results have remained the same, but their theoretical interpretation has
changed.
Probabilistic Solutions to Differential Equations and their Application to Riemannian Statistics
We study a probabilistic numerical method for the solution of both boundary
and initial value problems that returns a joint Gaussian process posterior over
the solution. Such methods have concrete value in the statistics on Riemannian
manifolds, where non-analytic ordinary differential equations are involved in
virtually all computations. The probabilistic formulation permits marginalising
the uncertainty of the numerical solution such that statistics are less
sensitive to inaccuracies. This leads to new Riemannian algorithms for mean
value computations and principal geodesic analysis. Marginalisation also means
results can be less precise than point estimates, enabling a noticeable
speed-up over the state of the art. Our approach is an argument for a wider
point that uncertainty caused by numerical calculations should be tracked
throughout the pipeline of machine learning algorithms.
Comment: 11 pages (9-page conference paper, plus supplements)
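As an illustration of this family of solvers (though not the specific method of the paper), a minimal Gaussian ODE filter returns a posterior mean and variance over the solution of a scalar initial value problem; all modelling choices below are our own simplifications.

```python
import numpy as np

def ode_filter(f, x0, t_grid, sigma2=1.0):
    """Minimal Gaussian ODE filter for x'(t) = f(t, x), x(t0) = x0.
    State is (x, x') under an integrated Wiener process prior; each step
    conditions on the constraint x' = f(t, x) at the predicted mean.
    Returns posterior means and variances of x on t_grid."""
    m = np.array([x0, f(t_grid[0], x0)])
    P = np.zeros((2, 2))                        # exact initial condition
    means, variances = [m[0]], [P[0, 0]]
    for t_prev, t in zip(t_grid[:-1], t_grid[1:]):
        h = t - t_prev
        A = np.array([[1.0, h], [0.0, 1.0]])    # IWP transition
        Q = sigma2 * np.array([[h**3 / 3.0, h**2 / 2.0],
                               [h**2 / 2.0, h]])
        m, P = A @ m, A @ P @ A.T + Q           # predict
        r = m[1] - f(t, m[0])                   # residual of x' = f(t, x)
        S = P[1, 1]                             # predictive variance of x'
        K = P[:, 1] / S                         # Kalman gain for H = [0, 1]
        m = m - K * r                           # condition on the ODE
        P = P - np.outer(K, K) * S
        means.append(m[0]); variances.append(P[0, 0])
    return np.array(means), np.array(variances)

# toy usage: exponential decay, posterior variance tracks numerical error
mu, var = ode_filter(lambda t, x: -x, 1.0, np.linspace(0.0, 1.0, 50))
```

It is this per-point Gaussian posterior that downstream Riemannian computations can marginalise over, rather than trusting a single point estimate.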
Probabilistic Line Searches for Stochastic Optimization
In deterministic optimization, line searches are a standard tool ensuring
stability and efficiency. Where only stochastic gradients are available, no
direct equivalent has so far been formulated, because uncertain gradients do
not allow for a strict sequence of decisions collapsing the search space. We
construct a probabilistic line search by combining the structure of existing
deterministic methods with notions from Bayesian optimization. Our method
retains a Gaussian process surrogate of the univariate optimization objective,
and uses a probabilistic belief over the Wolfe conditions to monitor the
descent. The algorithm has very low computational cost, and no user-controlled
parameters. Experiments show that it effectively removes the need to define a
learning rate for stochastic gradient descent.
Comment: Extended version of the NIPS '15 conference paper; includes detailed
pseudo-code. 59 pages, 35 figures
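The probabilistic Wolfe test at the heart of this construction reduces to a bivariate Gaussian orthant probability. A minimal sketch, assuming the GP surrogate supplies a joint Gaussian belief over (f(0), f'(0), f(t), f'(t)) along the search direction; the interface and names are ours.

```python
import numpy as np
from scipy.stats import multivariate_normal

def wolfe_probability(t, mu, Sigma, c1=1e-4, c2=0.9):
    """Probability that both weak Wolfe conditions hold at step size t,
    given a joint Gaussian belief N(mu, Sigma) over
    u = (f(0), f'(0), f(t), f'(t)):
      a = f(0) + c1*t*f'(0) - f(t) >= 0   (sufficient decrease)
      b = f'(t) - c2*f'(0)         >= 0   (curvature)"""
    L = np.array([[1.0, c1 * t, -1.0, 0.0],
                  [0.0, -c2,    0.0, 1.0]])
    m, C = L @ mu, L @ Sigma @ L.T
    # P(a >= 0, b >= 0) equals the CDF of (-a, -b) at the origin
    return multivariate_normal(mean=-m, cov=C).cdf(np.zeros(2))
```

The search proposes candidate step sizes until this probability exceeds an acceptance threshold, replacing the hard accept/reject decision of a deterministic line search.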
DeepOBS: A Deep Learning Optimizer Benchmark Suite
Because the choice and tuning of the optimizer affect the speed, and
ultimately the performance, of deep learning, there is significant past and
recent research in this area. Yet, perhaps surprisingly, there is no generally
agreed-upon protocol for the quantitative and reproducible evaluation of
optimization strategies for deep learning. We suggest routines and benchmarks
for stochastic optimization, with special focus on the unique aspects of deep
learning, such as stochasticity, tunability and generalization. As the primary
contribution, we present DeepOBS, a Python package of deep learning
optimization benchmarks. The package addresses key challenges in the
quantitative assessment of stochastic optimizers, and automates most steps of
benchmarking. The library includes a wide and extensible set of ready-to-use
realistic optimization problems, such as training Residual Networks for image
classification on ImageNet or character-level language prediction models, as
well as popular classics like MNIST and CIFAR-10. The package also provides
realistic baseline results for the most popular optimizers on these test
problems, ensuring a fair comparison to the competition when benchmarking new
optimizers without having to run costly baseline experiments. It comes with output
back-ends that directly produce LaTeX code for inclusion in academic
publications. It supports TensorFlow and is available open source.
Comment: Accepted at ICLR 2019. 9 pages, 3 figures, 2 tables
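For orientation, using the package follows a runner pattern roughly like the sketch below. It paraphrases our reading of the DeepOBS documentation, so the exact signatures, argument names, and hyperparameter format should be treated as assumptions and checked against the package docs.

```python
import tensorflow as tf
from deepobs import tensorflow as tfobs

# Optimizer under test, plus the hyperparameters the runner should expose.
# (The hyperparameter-specification format here is an assumption.)
optimizer_class = tf.train.MomentumOptimizer
hyperparams = [{"name": "learning_rate", "type": float},
               {"name": "momentum", "type": float, "default": 0.99}]

runner = tfobs.runners.StandardRunner(optimizer_class, hyperparams)
# Train on one of the ready-made test problems and record the metrics
# the benchmark protocol needs.
runner.run(testproblem="quadratic_deep", batch_size=128, num_epochs=10)
```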